1. Introduction


Optimization  analyses  within  firms  usually  presuppose  the 
specification  and  estimation  of  a  functional  which  describes  terminal 
actions  and  state  variables  in  units  of  some  criterion,  such  as 
operating  costs.  In  most  operations  research  studies  optimization 
problems  have  the  character  of  maximization  or  minimization.  For 
example,  in  linear  programming  resource  allocations,  one  seeks  to 
minimize  operating  cost  expressed  as  a  linear  function  of  the  levels 
of  different  activities  and  a  given  set  of  cost  coefficients,  subject 
to  a  defined  system  of  constraints.  Within  the  programming  framework 
one  can  explore  the  sensitivity  of  proposed  solutions  to  errors  in  the 
estimation  of  model  coefficients  with  the  aid  of  a  post-optimality 
analysis  using  parametric  techniques.  However,  the  task  of  obtaining 
initial  estimates  of  all  coefficients  remains  with  the  analyst's  best 
judgment.  Moreover,  in  many  operations  research  investigations,  such 
as  those  employing  non-linear  programming  models,  the  form  of  the 
functional  to  be  optimized  (or  relations  within  the  constraint  set) 
is  not  known  a  priori  and,  consequently,  a  statistical  analysis  of 
environmental  and  historical  data  must  precede  the  normative  development 
of  explicit  decision-making  procedures.  In  this  context  the  applied 
scientist  often  must  develop  the  theoretical  model  and  complete  the 
empirical  analysis  to  successfully  implement  his  recommendations. 

A  recent  paper  by  Professor  Theil  [9]  discusses  some  of  the 
general  interactions  between  the  fields  of  econometrics  and  management 
science. Clearly, at one time or another the operations research or
management scientist is both a theoretician and, like the econometrician,
an  empiricist.  In  contrast  to  discussions  on  theory,  the  literature  on 
empirical  problems  in  operations  research  is  notably  sparse.  The 
purpose  of  this  report  is  to  illustrate  the  overlap  between  data 
analysis  problems  in  operations  research  and  the  theoretical  development 
of  a  model.  The  case  in  point  to  be  considered  is  statistical  cost 
estimation  in  quadratic  programming  models. 

In  the  next  section  we  review  the  basic  characteristics  of 
quadratic  programming  analysis  and  then  introduce  an  example  model 
based  on  the  study  of  linear  decision  rules  for  production  planning 
by Holt, Modigliani, Muth, and Simon [2]. The discussion then proceeds
to  consideration  of  alternative  approaches  to  estimating  the 
coefficients  in  a  specified  objective  function  based  on  operating  costs. 
Within  this  framework  several  general  approaches  are  discussed,  including 
simple  multiple  regression,  the  application  seriatim  of  single 
equation  techniques  to  relations  in  the  model,  and  the  simultaneous 
estimation  of  equation  systems,  as  with  k-class  estimates  in 
econometrics.  In  conclusion,  some  practical  considerations  for  error 
analysis  in  estimation  and  sensitivity  analysis  in  optimization  are 
reviewed. 

2.  Background:  Linear  Decision  Rules  and  Quadratic  Programming 

Quadratic  programming  problems  concern  the  optimization  of  a 
quadratic objective function subject to a linear system of constraint
equations (or inequations). Optimization problems with this
mathematical structure have arisen in various applied areas, such
as capital budgeting in investment portfolio analysis, aggregate
production and employment scheduling, and so on (e.g., see Boot [1]).

The basic problem of minimizing (or maximizing) a quadratic
function with respect to a column vector of actions can be stated as

$$\min_{\{a\}} \; c(a, x) = \lambda_0(x) + 2a'\lambda(x) + a'Qa , \qquad [1]$$

where the vectors $a$ and $x$ are of dimensions $(n \times 1)$, $\lambda_0(x)$ is a scalar
function of $x$, $\lambda(x)$ is a vector function of $x$, such as $\lambda(x) = Dx$ for
$D$ an $(m \times n)$ dimensional matrix of rank $m < n$, and $Q$ is a non-singular
matrix of dimensions $(n \times n)$. Let $\partial/\partial a$ denote taking partial
derivatives with respect to the column vector $a$. A necessary condition
for $a = a^*$ to be a local minimum of [1] is that

$$\left. \frac{\partial c(a, x)}{\partial a} \right|_{a = a^*} = 2\lambda(x) + 2Qa^* = 0 , \qquad [2]$$

or equivalently that

$$a^* = -Q^{-1}\lambda(x) = -Q^{-1}Dx . \qquad [3]$$

A sufficient condition that [3] be the global minimum is that
$c(a, x)$ be convex or, specifically, that $Q$ is positive definite.

The addition of linear equality constraints to the problem in [1]
can be incorporated within this framework with little difficulty. For
example, suppose the original problem is

$$\min_{\{a\}} \; c(a, z) = k + 2\alpha' a + 2\beta' z + (a'Aa + z'Bz + a'Cz + z'C'a) \qquad [4]$$

$$\text{subject to:} \quad z = Ra + x , \qquad [5]$$

where $R$ is a $(t \times n)$ dimensioned matrix of full row rank and $t < n$. The
problem stated in [4] subject to [5] can be reformulated as the problem
posed in [1] by substitution of [5] into [4] for the vector $z$. That is,
the relations in [1] become

$$\begin{aligned}
\lambda_0(x) &= k + 2\beta' x + x'Bx , \\
\lambda(x) &= \alpha + R'\beta + (R'B + C)\,x , \\
Q &= A + R'BR + CR + R'C' .
\end{aligned} \qquad [6]$$

The  solution  to  the  problem  as  now  stated  is  identical  to  that  given 
by  [3]. 
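The substitution argument above can be checked numerically. The following is a minimal sketch, assuming symmetric $A$ and $B$ and randomly generated illustrative data; the names `alpha`, `beta`, `cost_4` and `cost_1` are hypothetical labels for the linear-term vectors and the two forms of the objective, not notation confirmed by the paper. It builds the reduced quantities $\lambda_0(x)$, $\lambda(x)$ and $Q$, confirms that the constrained objective [4]-[5] and the reduced objective [1] agree, and solves for $a^*$ as in [3].

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 3, 2   # hypothetical sizes: n actions, t constraint rows (t < n)

# Illustrative data for problem [4]-[5]; none of these values come from the paper.
k = 1.0
alpha = rng.normal(size=n)
beta = rng.normal(size=t)
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)          # symmetric positive definite (assumption)
N = rng.normal(size=(t, t))
B = N @ N.T + t * np.eye(t)          # symmetric positive definite (assumption)
C = 0.1 * rng.normal(size=(n, t))
R = rng.normal(size=(t, n))
x = rng.normal(size=t)

def cost_4(a):
    """Objective [4] with z eliminated through the constraint z = R a + x."""
    z = R @ a + x
    return (k + 2 * alpha @ a + 2 * beta @ z
            + a @ A @ a + z @ B @ z + a @ C @ z + z @ C.T @ a)

# Reduced quantities obtained by substituting [5] into [4].
lam0 = k + 2 * beta @ x + x @ B @ x
lam = alpha + R.T @ beta + (R.T @ B + C) @ x
Q = A + R.T @ B @ R + C @ R + R.T @ C.T

def cost_1(a):
    """Objective in the reduced form [1]."""
    return lam0 + 2 * a @ lam + a @ Q @ a

# Solution [3]: solve Q a* = -lambda rather than forming the inverse explicitly.
a_star = -np.linalg.solve(Q, lam)
```

Solving the linear system instead of inverting $Q$ is a numerical convenience only; the result is the same decision vector as in [3].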

If the system of constraints in [5] is one of inequality
relations, such as

$$Ea + x \le z , \qquad [7]$$

where $E$ is a matrix of full row rank with $(t \times n)$ dimensions, and $t < n$,
the above procedure can still be employed through the introduction of
a $(t \times 1)$ dimensioned slack vector $w$. That is, write

$$Ea + Iw + x = z , \quad \text{or} \quad R^{o}a^{o} + x = z , \qquad [8]$$

where $R^{o} = [E \;\; I]$ and $a^{o} = [a' \;\; w']'$. Then, appropriately augmenting the
statements for $\lambda(x)$ and $Q$ in [1] to conform to the $(n + t)$ dimensions of $a^{o}$,
the solution sequence can proceed as before to obtain $(a^{o})^* \Rightarrow a^*$; that
is, we now have $\lambda^{o}(x) = (\lambda(x)', 0')'$ and

$$Q^{o} = \begin{bmatrix} Q & 0 \\ 0 & 0 \end{bmatrix}$$

for $Q^{o}$ a square $(n + t)$ matrix.


Similar procedures can be introduced to handle non-negativity
restrictions on $a$, that is, $a_i \ge 0$ for $i = 1, 2, \ldots, n$. For example,
the initial problem specification in terms of $a_i$ might be redefined
in units of an arbitrary norm, say $a_i(N)$, so that $a_i = a_i(N) - \tilde{a}_i$
and the new variables $\tilde{a}_i$ for $i = 1, \ldots, n$ are unrestricted in sign. In
general, quadratic programming problems are a special case of non-linear
convex programming and can be solved by reference to the Kuhn-Tucker
theorem,1/ which gives necessary and sufficient conditions for an optimal
solution vector $a^*$.

Referring to the general solution for $a^*$ in equation [3] above
we note that this expression can be written simply as

$$a^* = Kx , \quad \text{or} \quad a_i^* = \sum_{j=1}^{n} k_{ij} x_j \;\; \text{for } i = 1, 2, \ldots, n . \qquad [9]$$

That is, the procedures which determine the optimal actions under the
quadratic programming problem are simple linear decision rules whose
arguments are the state variables $x_j \in x$, typically not controlled by
the decision-maker.

1/ See Kuhn and Tucker [6]. In the discussion by Boot [1] a number of
computational algorithms for solving quadratic programming problems
are detailed. Several data processing equipment manufacturers have
quadratic programming computer codes available based on these
algorithms.


3.  Specification  of  Quadratic  Cost  Functionals 

From the above we see that for decision problems which can be
expressed in the form of [1], the general solution is that given by
[3], or equivalently [9]. To illustrate this class of problems, and
the corresponding issues in estimation, consider the aggregate planning
problem first studied by Holt, Modigliani, Muth, and Simon in [2].1/
The decision problem in the HMMS model was to find for an individual
firm production and work force levels, $P_t$ and $W_t$ for $t = 1, 2, \ldots, T$
periods, that minimize expected total costs $E[C(T)]$, where


$$C(T) = \sum_{t=1}^{T} C_t \quad \text{(the sum of operating costs in each period),} \qquad [10]$$

and

$$\begin{aligned}
C_t = [\; & c_{11} + c_{12} W_t && \text{(regular payroll costs ... [10-1])} \\
 & + c_{21}(W_t - W_{t-1} - c_{22})^2 && \text{(hiring and layoff costs ... [10-2])} \\
 & + c_{31}(P_t - c_{32} W_t)^2 + c_{33} P_t - c_{34} W_t + c_{35} P_t W_t && \text{(overtime costs ... [10-3])} \\
 & + c_{61}(I_t - c_{62} - c_{63} S_t)^2 \;] , && \text{(inventory connected costs ... [10-4])}
\end{aligned}$$

subject to the restrictions that

$$I_{t-1} + P_t - S_t = I_t , \qquad P_t \text{ and } W_t \ge 0 , \qquad t = 1, 2, \ldots, T ; \qquad [11]$$


1/ In subsequent discussion the work by these authors will be referred
to as the "HMMS model."

where sales $S_t$ is a stochastic variable with known probability distribution
for all $t$, $I_t$ represents the ending inventory balance for all $t$, and
the cost coefficients $c_{ij}$ are known or can be estimated. By expanding
the relations in [10] and regrouping terms we see that [10] is a
special case of [4]; similarly, writing the inventory constraint as


$$I_t = I_0 + \sum_{\tau=1}^{t} P_\tau - \sum_{\tau=1}^{t} S_\tau , \qquad t = 1, 2, \ldots, T ,$$


we  see  that  [11]  is  a  special  case  of  [5].  Hence,  given  cost 
coefficient  values,  the  mathematical  problem  in  [10]  and  [11]  can  be 
solved  using  the  previous  analysis.  The  operational  problem  is  then: 

How  can  best  estimates  of  the  cost  coefficients  be  obtained  using 
available  company  data?  More  basically,  one  might  ask:  How  can  the 
best  specification  of  the  cost  relations  in  [10-1]  through  [10-4]  be 
determined? 
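The step from the period-by-period inventory balance in [11] to the cumulative form that matches [5] can be illustrated with a few numbers. This is a small sketch only; the opening inventory, production and sales figures are hypothetical and chosen purely for illustration.

```python
import numpy as np

I0 = 50.0                                    # hypothetical opening inventory
P = np.array([100.0, 120.0, 90.0, 110.0])    # hypothetical production per period
S = np.array([95.0, 130.0, 100.0, 105.0])    # hypothetical sales per period

# Recursive form of the restriction: I_t = I_{t-1} + P_t - S_t.
I_recursive = np.empty_like(P)
previous = I0
for t in range(len(P)):
    previous = previous + P[t] - S[t]
    I_recursive[t] = previous

# Cumulative form: I_t = I_0 + sum of production to date - sum of sales to date.
I_cumulative = I0 + np.cumsum(P) - np.cumsum(S)
```

Both arrays hold the same inventory path, which is what allows the constraint to be written as a linear relation between the state and action vectors.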

For  example,  referring  to  the  specification  for  hiring  and  layoff 
costs  in  [10-2],  several  alternative  specifications  could  have  been 
considered,  such  as 

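$$(\text{Hiring and layoff cost})_1 = c_{23} H_t + c_{24} F_t , \qquad t = 1, 2, \ldots, T \qquad [12-1]$$

or

$$(\text{Hiring and layoff cost})_2 = c_{25}(H_t - F_t)^2 + c_{26}(H_t + F_t)^2 , \qquad t = 1, 2, \ldots, T \qquad [12-2]$$

where $H_t$ corresponds to workers hired in period $t$ and $F_t$ similarly for
workers laid off and

$$W_t = W_0 + \sum_{\tau=1}^{t} H_\tau - \sum_{\tau=1}^{t} F_\tau , \qquad t = 1, 2, \ldots, T .$$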


Alternative specifications might be considered for cost components
[10-3] and [10-4] as well.1/

Arguments for the specifications of the HMMS model chosen in
[10] are detailed in Chapters 2, 3, and 9 of reference [2] and are
analyzed further in Van de Panne and Bosje [10]. In the interests
of brevity we will not review this discussion here. Suffice it to say
that for the company environment analyzed the HMMS model specification
is as reasonable as any alternative, and perhaps preferable. However,
the  arguments  and  rationale  for  this  specification  may  lose  appeal 
when  considering  a  different  environment.  In  this  regard,  the  applied 
scientist  must  exploit  the  statistical  properties  of  his  empirical 
investigation  for  guidelines. 


To illustrate this last point in some detail, we introduce an
alternative model based on the HMMS study which was first discussed
in Kriebel [4]. Referring to this case as the HLA model, the initial
HLA specification is as follows: find non-negative production and work
force levels, $P_{it}$ and $W_t$ for $i = 1, 2, 3$ locations and $t = 1, 2, \ldots, T$ periods,
which minimize expected total costs $E[C(T)]$ where


$$C(T) = \sum_{t=1}^{T} C_t \quad \text{(the sum of operating costs in each period)} \qquad [13]$$

and

$$\begin{aligned}
C_t = [\; & c_{11} + c_{12} L_t && \text{(regular payroll costs ... [13-1])} \\
 & + c_{21}(W_t - W_{t-1} + c_{22})^2 && \text{(hiring and layoff costs ... [13-2])} \\
 & + c_{31}(P_t - c_{32} L_t + c_{33})^2 + c_{34} && \text{(overtime costs ... [13-3])} \\
 & + c_{41}(W_t - L_t)^2 + c_{51}(P_t - P_{t-1} + c_{52})^2 && \text{(other variable production costs ... [13-4])} \\
 & + \sum_{i=1}^{3} c_{61i}(I_{it} - c_{62i} - c_{63i} S_{it})^2 \;] && \text{(inventory connected costs at all locations ... [13-5])}
\end{aligned}$$

subject to the restrictions that

$$\begin{aligned}
I_{it} &= I_{i,t-1} + P_{it} - S_{it} , \quad i = 1, 2, 3 \\
L_t &= W_t - r_t \\
P_t &= \sum_{i=1}^{3} P_{it}
\end{aligned} \qquad \text{for } t = 1, 2, \ldots, T . \qquad [14]$$

1/ The specification for regular payroll cost given in [10-1] is perhaps
the most difficult to improve upon, given the ordinary accounting
procedures of most firms.


The variable $L_t$ represents the number of direct labor employees actually
reporting for work within a particular time period $t$, and is
stochastically determined for each period by $W_t$ and the value of $r_t$,
corresponding to the number of absentees. The subscript $i$ on $I_{it}$,
$P_{it}$ and $S_{it}$ serves to identify three separate locations where inventory
is stored and sales transactions occur. With the exception of the
overtime cost specification in equation [13-3] and the inclusion of
the relation in [13-4], the specification of the HLA model in [13] is
directly compatible with the HMMS model in [10]. Equation [13-4],
labeled "other variable production costs," consists of two expressions,
one corresponding to an absenteeism cost component and the other
corresponding to a cost component associated with changing production
levels. It is apparent from the preceding discussion that the
mathematical problem in [13] and [14] is one of quadratic programming
and that the HLA specification can be formulated to provide a solution
as given by [9]. For example, we can partition the action vector by
time periods, such that $a' = (a_1', \ldots, a_t', \ldots, a_T')$ where
$a_t' = (P_{1t}, P_{2t}, P_{3t}, W_t)$ for $t = 1, 2, \ldots, T$.


We  now  proceed  to  an  empirical  analysis  of  the  HLA  model  and  the 
determination  of  estimates  for  the  coefficients  of  the  specification 


in  [13],  given  historical  data  on  costs  and  the  decision  and  state 
variables.  To  simplify  this  discussion,  however,  we  will  omit 
consideration  of  the  inventory  connected  costs  component  [13-5]  in  [13],  since 
the  added  complication  introduces  no  new  issues  for  the  empirical  analysis. 
Thus,  in  subsequent  discussion  we  refer  to  the  HLA  model  cost  specification 
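simply as [13b]

$$\begin{aligned}
C(T) = \sum_{t=1}^{T} [\, & c_{11} + c_{12}L_t + c_{21}(W_t - W_{t-1} + c_{22})^2 + c_{31}(P_t - c_{32}L_t + c_{33})^2 \\
 & + c_{34} + c_{41}(W_t - L_t)^2 + c_{51}(P_t - P_{t-1} + c_{52})^2 \,] ,
\end{aligned}$$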


where  all  of  the  previous  definitions  apply. 

4. Single Equation Estimation of Cost Coefficients 1/

In conjunction with the preliminary analysis of the HLA company
environment which led to the cost specification above, data was obtained
on all variables and costs covering a history of approximately fifty
consecutive time periods. As a first approach to obtaining coefficient
estimates, a simple linear regression model was hypothesized of the form

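$$Y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_7 x_{7t} + \epsilon_t \qquad [15a]$$

for

$$C_t = A + b_1 L_t + b_2 L_t^2 + b_3(\Delta W_{t-1}) + b_4(\Delta W_{t-1})^2 + b_5 P_t + b_6 P_t^2 + b_7(P_t L_t) , \qquad [15b]$$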

1/ Computations for the statistical analyses discussed in the following
two sections were performed on the CIT G-20 computer with time
financed by the Graduate School of Industrial Administration. In this
regard, the author acknowledges the programming assistance of Henry
Townsend, a graduate student.

where $\Delta W_{t-1} = W_t - W_{t-1}$, and ordinary least squares estimates of the
eight coefficients were obtained from fifty-two observations. The results
of this analysis are summarized in Exhibit 1. The coefficient of multiple
determination adjusted for degrees of freedom in this regression was
0.91 with corresponding F-statistic of 74. On the basis of the multiple
correlation criterion this model provides good estimates of costs,
$C_t$; however, inspection of the last column in Table 1 indicates that only
the estimate of one coefficient has a significance level greater than 0.9.
This result is partially explained by reference to the simple correlation
matrix for the data. That is, referring to the simple correlations
between the independent variables $x_1$ to $x_7$, corresponding to $L_t$ to $P_t L_t$
in the regression model of [15], the following correlations exceed a value
of .50:

$$\begin{aligned}
\rho_{12} &= \rho(L_t, L_t^2) = .99 \\
\rho_{17} &= \rho_{27} = .88 \\
\rho_{56} &= \rho(P_t, P_t^2) = \text{[value illegible]} \\
\rho_{57} &= \rho_{67} = \text{[value illegible]}
\end{aligned}$$

EXHIBIT 1

ORDINARY LEAST SQUARES ANALYSIS OF REGRESSION MODEL 1, EQUATION [15]


We conclude therefore that multicollinearity exists between the first
and second degree terms for $L_t$ and $P_t$ in the regression, even though
the actual relation between these variables is known a priori to be
nonlinear. Further inspection of the data reveals that the source of
this difficulty lies in the narrow range of the observations recorded for
these variables, viz., the variance to mean ratios for the observations
on $L_t$ and $P_t$ are .53 and 1.0, respectively.


If multicollinearity were the only problem in the regression results,
we could circumvent the difficulty in this case by applying a linear
transformation to the variables affected. That is, consider the
regression

$$z_t = \alpha + \beta_1 Y_{1t} + \beta_2 Y_{2t} + u_t \qquad [16]$$

and assume $Y_{1t} = g_t$ and $Y_{2t} = g_t^2$. Let $y_{1t} = (Y_{1t} - \bar{Y}_1)$ and
$y_{2t} = y_{1t}^2$, and consider the alternate regression

$$z_t = \gamma + \delta_1 y_{1t} + \delta_2 y_{2t} + v_t . \qquad [17]$$

The relation between the coefficients in [16] and those in [17] is
simply

$$\begin{aligned}
\alpha &= \gamma - \delta_1 \bar{Y}_1 + \delta_2 \bar{Y}_1^2 = \gamma - \delta_1 \bar{g} + \delta_2 (\bar{g})^2 , \\
\beta_1 &= \delta_1 - 2\bar{Y}_1 \delta_2 = \delta_1 - 2\bar{g}\,\delta_2 , \\
\beta_2 &= \delta_2 .
\end{aligned}$$

Subsequent  analyses  employing  least  squares  estimates  of  the  regression 
coefficients  can  now  be  implemented  based  on  the  results  obtained  from 
[17], discarding the initial regression in [16]. Revising the regression
in [15] by this procedure gives the following changes to Table 1:

new constant = $\gamma$ = 10,585.4,

new coefficient for $L_t$ = $\delta_1$ = 123.56, with t statistic 10.3,

new coefficient for $P_t$ = $\delta_5$ = 30.53, with t statistic 5.95,

the  remaining  coefficient  estimates  essentially  unaltered. 
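The effect of a narrow observation range, and of the centering transformation, can be reproduced on synthetic data. The sketch below is illustrative only: the regressor `g`, its mean and spread, and the "true" coefficients are all hypothetical and are not taken from the HLA data. It fits the raw regression in a variable and its square, fits the centered regression, compares the correlation between the linear and squared terms in each case, and maps the centered coefficients back to the raw ones.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 52

# Hypothetical regressor observed over a narrow range relative to its mean.
g = 20.0 + rng.normal(scale=1.0, size=T)
z = 3.0 + 0.5 * g + 0.02 * g**2 + rng.normal(scale=0.5, size=T)

def ols(X, y):
    """Ordinary least squares coefficients via numpy's least-squares solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Raw regression: z on a constant, g and g squared.
X_raw = np.column_stack([np.ones(T), g, g**2])
alpha, beta1, beta2 = ols(X_raw, z)

# Centered regression: z on a constant, (g - mean) and (g - mean) squared.
g_bar = g.mean()
y1 = g - g_bar
X_ctr = np.column_stack([np.ones(T), y1, y1**2])
gamma, delta1, delta2 = ols(X_ctr, z)

# Correlation between first- and second-degree terms, before and after centering.
corr_raw = np.corrcoef(g, g**2)[0, 1]
corr_ctr = np.corrcoef(y1, y1**2)[0, 1]

# Map the centered coefficients back to the raw parameterization.
alpha_back = gamma - delta1 * g_bar + delta2 * g_bar**2
beta1_back = delta1 - 2.0 * g_bar * delta2
beta2_back = delta2
```

The two regressions span the same column space, so the fitted costs are identical; centering changes only how the information is shared among the coefficient estimates.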

However,  multicollinearity  is  not  the  only  difficulty  with  the 
regression  model  in  [15].  A  more  basic  problem  concerns  the  regression 
specification  and  subsequent  identification  of  the  cost  parameters  in 
the  original  model  of  [13b]  based  on  the  estimated  regression  coefficients. 
For example, even if for convenience we assume a priori that two of the
cost coefficients are identically zero, so that the number of
remaining cost coefficients in [13b] equals the number of regression
coefficients in [15], the initial cost model is over-identified. That
is, referring to the regression model we see that the value of one
cost coefficient can be either $\tfrac{1}{2}(b_i/b_j)$, for a pair of regression
coefficients, or a second, different expression in those coefficients.
It can be shown in this case that neither of these estimates based on
the least squares regression will be a maximum likelihood "best"
estimate of the coefficient value.1/
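One way to see how over-identification can arise is to expand the period cost in [13b] and match terms with the regression [15b]. The following worked expansion is a sketch under our own assumptions: the coefficients on the absenteeism and production-change terms ($c_{41}$ and $c_{51}$) are set to zero for illustration, since the regression contains no corresponding variables, and $\Delta W$ denotes the work-force change $W_t - W_{t-1}$. The source passage does not legibly state which coefficients were set to zero, so this choice is hypothetical.

```latex
\begin{aligned}
C_t &= c_{11} + c_{12}L_t + c_{21}(\Delta W + c_{22})^2
       + c_{31}(P_t - c_{32}L_t + c_{33})^2 + c_{34} \\
    &= \underbrace{\left(c_{11} + c_{34} + c_{21}c_{22}^2 + c_{31}c_{33}^2\right)}_{A}
     + \underbrace{\left(c_{12} - 2c_{31}c_{32}c_{33}\right)}_{b_1} L_t
     + \underbrace{c_{31}c_{32}^2}_{b_2} L_t^2 \\
    &\quad + \underbrace{2c_{21}c_{22}}_{b_3}\,\Delta W
     + \underbrace{c_{21}}_{b_4}\,(\Delta W)^2
     + \underbrace{2c_{31}c_{33}}_{b_5} P_t
     + \underbrace{c_{31}}_{b_6} P_t^2
     + \underbrace{\left(-2c_{31}c_{32}\right)}_{b_7} P_t L_t .
\end{aligned}
```

Under these assumptions the productivity-type coefficient $c_{32}$ can be recovered in two ways, $c_{32} = -\tfrac{1}{2}(b_7/b_6)$ from the cross-product term or $c_{32}^2 = b_2/b_6$ from the squared term, and unrestricted least squares estimates of $b_2$, $b_6$ and $b_7$ will not in general make the two agree. The constants $c_{11}$ and $c_{34}$, by contrast, enter only through their sum.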

One  approach  to  resolving  the  identification  problem  is  to 
introduce restrictions a priori on relations between the admissible values
of  the  regression  estimates.  Such  restrictions  could  be  incorporated 
into  the  regression  analysis  in  a  variety  of  ways.  For  example,  the 
restrictions  could  be  included  as  equality  constraints  on  the  regression 
parameters,  obtaining  constrained  least  squares  estimates  through  a 
quadratic  programming  analysis.  Alternatively,  additional  relations 
between  variables  could  be  introduced  into  either  the  cost  specification 
or  the  regression  model  until  an  exact  correspondence  between  the 
coefficients  was  realized.  As  in  the  above  example,  however,  exact 
identification  by  this  procedure  is  not  always  possible. 

1/ An interesting modification of the standard regression procedure
which yields maximum likelihood estimates when the parameters
in a normal regression model are overidentified has been
suggested by Lovell [7]. Basically, Lovell's approach seeks
values of the coefficients which minimize the standard error of
estimate while applying a search procedure, such as the Fibonacci
routine, over the range of values for the overidentified coefficient
(in this case, a single cost coefficient).

An allied, though separate, problem with the results obtained in
our initial regression analysis concerns the question of admissible
values of the coefficient estimates. That is, referring to the cost
specification in [13b] it is clear that, whatever procedure is used to
obtain coefficient estimates, $\hat{c}_{ij}$, we require that the estimated cost
equation be non-negative for all positive values of $L_t$, $W_t$ and $P_t$.
For example, we might require in particular that $\hat{c}_{21} > 0$, with
similar sign conditions on other coefficients. Similarly, we may possess
a priori qualitative information on the range of admissible values for certain
coefficient estimates which would be appropriate to include within
our analysis in addition to the observed information on the variables.
One approach to this problem could be to introduce constraints and
proceed as suggested above under a quadratic programming analysis.
Another approach has been described by Theil [8] as "mixed estimation."
In Theil's framework the initial regression model comparable to [15], say


$$y = X\beta + u ,$$

where $X$ represents the matrix of observational information, is
augmented by

$$r = R\beta + v ,$$

where $R$ represents the matrix of non-observational (qualitative)
information. The generalized least-squares estimator of the elements
in $\beta$ is then

$$\hat{\beta} = (X'\Sigma^{-1}X + R'\Psi^{-1}R)^{-1}(X'\Sigma^{-1}y + R'\Psi^{-1}r) , \qquad [20]$$

where $E[uu'] = \Sigma$, $E[vv'] = \Psi$, and $E[uv'] = 0$.
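The estimator in [20] is simple to compute directly. The sketch below is a minimal illustration on synthetic data; the function name `mixed_estimate`, the argument names (`Sigma` and `Psi` for the two disturbance covariance matrices, `r` for the vector of prior values) and the single prior restriction are all hypothetical choices made for the example, not notation or data from the paper.

```python
import numpy as np

def mixed_estimate(X, y, Sigma, R, r, Psi):
    """Generalized least squares on the sample (y = X b + u) stacked with
    prior information (r = R b + v), with u and v uncorrelated."""
    Sigma_inv = np.linalg.inv(Sigma)
    Psi_inv = np.linalg.inv(Psi)
    lhs = X.T @ Sigma_inv @ X + R.T @ Psi_inv @ R
    rhs = X.T @ Sigma_inv @ y + R.T @ Psi_inv @ r
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(2)
T = 52
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
beta_true = np.array([1.0, 2.0, -0.5])           # made-up coefficients
y = X @ beta_true + rng.normal(scale=0.5, size=T)
Sigma = 0.25 * np.eye(T)                          # assumed disturbance covariance

# Hypothetical prior belief: the third coefficient is about -0.5.
R = np.array([[0.0, 0.0, 1.0]])
r = np.array([-0.5])

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_tight = mixed_estimate(X, y, Sigma, R, r, Psi=np.array([[1e-8]]))   # very firm prior
b_loose = mixed_estimate(X, y, Sigma, R, r, Psi=np.array([[1e8]]))    # very vague prior
```

With a very small prior variance the restricted coefficient is pulled to the prior value, and with a very large one the result is indistinguishable from ordinary least squares; intermediate values weight the two sources of information.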
Rather than pursuing these considerations in detail to resolve the
difficulties in the initial regression model, we turn our attention now
to a different approach for obtaining cost coefficient estimates.


5.  Estimation  of  Simultaneous  Equation  Systems 


Consider again the question of specifying an estimating relationship
for operating cost in the HLA model. Ignoring our original specification
of period operating costs given by equation [13b], we have simply
that total costs

$$C(T) = \sum_{t=1}^{T} C_t \quad \text{(the sum of operating costs in each period),} \qquad [21]$$

and

$$\begin{aligned}
C_t &= [(\text{hiring and layoff costs})_t + (\text{regular payroll costs})_t
       + (\text{overtime costs})_t + (\text{other variable production costs})_t] \\
    &= C_{1t} + C_{2t} + C_{3t} + C_{4t} .
\end{aligned}$$

Retaining our earlier definitions on the variables, we might refine
this statement of operating costs by stipulating the components as

$$\begin{aligned}
C_{1t} &= \alpha_1 + f_1(W_t - W_{t-1}) + u_{1t} && \text{(hiring and layoff costs),} && [22-1] \\
C_{2t} &= \alpha_2 + \beta_{21} L_t + u_{2t} && \text{(regular payroll costs),} && [22-2] \\
C_{3t} &= \alpha_3 + f_3(L_t, P_t) + u_{3t} && \text{(overtime costs),} && [22-3] \\
C_{4t} &= \alpha_4 + f_4(W_t - L_t, P_t - P_{t-1}) + u_{4t} && \text{(other variable production costs),} && [22-4]
\end{aligned}$$

where the functions $f_i(\cdot)$ are quadratic in the arguments shown, and
the variables $u_{it}$ for $i = 1, \ldots, 4$ correspond to disturbance terms for
which we assume

$$E[u_{it}] = 0 \quad \text{for all } t \text{ and each } i ,$$

$$E[u_{it}\,u_{i,t+j}] = \begin{cases} \sigma_i^2 & \text{for } j = 0, \text{ all } t \text{ and each } i , \\ 0 & \text{for } j \ne 0, \text{ all } t \text{ and each } i , \end{cases}$$

and that each $u_{it}$ is independent of the predetermined variables 1/
(that is, $P_t$, $W_t$, and $L_t$).

Clearly,  the  three  cost  components  in  [22-2],  [22-3],  and  [22-4] 
will  be  correlated,  since  they  contain  common  arguments  in  the  right- 
hand-side  relations.  For  example,  given  the  linear  expression  for 
regular  payroll  costs  in  equation  [22-2],  the  expression  for  overtime 
costs  can  be  rewritten  as 

$$C_{3t} = \alpha_5 + f_5(C_{2t}, P_t) + u_{3t}$$

and it is apparent that $C_{3t}$ will be influenced by the disturbance term
$u_{2t}$. More generally, if strong association (i.e., high positive
correlation)  exists  between  the  components  in  [22] ,  this  information 
becomes  lost  when  coefficient  estimates  are  obtained  from  a  regression 
employing  the  aggregated  model.  That  is,  the  disturbance  terms  are 
additive  between  components  and  a  corresponding  increase  occurs  in 
the  standard  errors  of  the  coefficient  estimates.  Better  results  can 
be  obtained  by  refining  the  estimation  procedure  to  take  into  account 
the  component  relationships,  either  individually  or  as  a  system  of 
equations. 

1/ We will also assume zero autocorrelations for the disturbance terms
for each i.

From the detailed accounting data available in the firm, an
initial estimate of the correlation between cost components, and hence
potential cost specifications, was obtained by the simple regression

$$C_t = \gamma_1 C_{1t} + \gamma_2 C_{2t} + \gamma_3 C_{3t} + \gamma_4 C_{4t} + u_t , \qquad [23]$$

where each of the $\gamma_i$ assume unit values. On the basis of this analysis,
it  was  decided  to  group  overtime  and  other  variable  costs  as  one 
component  and  to  consider  regular  payroll  cost  as  a  separate  component, 
since  the  former  showed  negligible  correlation  and  the  latter  indicated 
high  correlation  with  the  remaining  costs.  For  other  reasons,  principally 
because  a  different  accounting  basis  had  been  employed,  it  was  decided 
also  to  treat  hiring  and  layoff  costs  as  a  separate  cost  equation. 


From these and earlier considerations a variety of model
specifications were considered seriatim for each of the cost component
relationships. In the interests of brevity only the final model is
presented: 1/

$$\hat{c}_t = d_1\hat{c}_{1t} + d_2\hat{c}_{2t} + d_3\hat{c}_{3t} \quad \text{(estimated operating costs for period } t\text{)} \qquad [24]$$

where

$$\begin{aligned}
\hat{c}_{1t} &= b_1(H_t + F_t)^2 + b_2(H_t - F_t)^2 && \text{(estimated hiring and layoff costs)} && [24-1] \\
\hat{c}_{2t} &= b_3 L_t && \text{(estimated regular payroll costs)} && [24-2] \\
\hat{c}_{3t} &= A + b_4(P_t - 2.76 L_t)^2 + b_5 P_t + b_6 P_t L_t + b_7 L_t \\
             &\quad + b_8(W_t - L_t)^2 + b_9(P_t - P_{t-1})^2 && \text{(estimated other variable production costs)} && [24-3]
\end{aligned}$$

1/ The constant 2.76 appearing in the first term of equation [24-3]
corresponds to an independent estimate of the labor productivity
coefficient for direct production work force obtained from available
data within the firm. An analysis of variance for different levels of the
workforce and production accepted the constant variance hypothesis
for this figure. Had this estimate not been available an equation
could have been added to the model expressing production as a function
of the work force level, and the analysis would proceed as below.
Estimates for the coefficients in this model were obtained using
ordinary two-stage least squares. That is, first the coefficients for
the cost relations expressed by the system of equations in [24-1]
to [24-3] which contain only predetermined variables were estimated by
taking the least-squares regression of these actual costs on the
right-hand-side relationships. The corresponding cost components in the
original equation for operating costs were then replaced by their
estimated values from the first-stage regression and least squares
analysis was again applied to this reformulated relation.1/ The
estimating model obtained by this procedure was:

$$\begin{aligned}
\hat{c}_t = [\, & 17{,}607 + 1.55(H_t + F_t)^2 + 1.41(H_t - F_t)^2 - 137.7 L_t \\
 & + 0.15(P_t - 2.76 L_t)^2 - 72.22 P_t + 1.05 P_t L_t + 0.97(W_t - L_t)^2 \\
 & - 0.26(P_t - P_{t-1})^2 \,] \qquad [25]
\end{aligned}$$

which  has  an  adjusted  coefficient  of  multiple  determination  of  .93 
with corresponding F-statistic of 219. The relative goodness-of-fit
for  the  estimating  model  in  equation  [25]  is  illustrated  by  the  graphs 
of  Exhibit  2  which  trace  actual  and  estimated  operating  costs  at  HLA 
over  fifty-two  time  periods. 
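The two-step procedure described above (component regressions first, then a regression of total cost on the fitted components) can be sketched on synthetic data. Everything in this example is hypothetical: the simulated work force, production, hiring and layoff series, the coefficient values used to generate costs, and the helper `ols_fit` are assumptions for illustration and do not reproduce the HLA data or the paper's estimates. The sketch mirrors the procedure as it is described in the text rather than a general instrumental-variables implementation.

```python
import numpy as np

def ols_fit(X, y):
    """Return least-squares coefficients and fitted values."""
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef, X @ coef

rng = np.random.default_rng(3)
T = 52

# Synthetic predetermined variables (hypothetical magnitudes, not HLA data).
W = 100.0 + rng.normal(scale=5.0, size=T + 1)        # work force
L = W - rng.uniform(0.0, 5.0, size=T + 1)            # employees reporting
P = 2.76 * L + rng.normal(scale=10.0, size=T + 1)    # production
H = rng.uniform(0.0, 5.0, size=T + 1)                # hires
F = rng.uniform(0.0, 5.0, size=T + 1)                # layoffs
dP = P[1:] - P[:-1]                                   # change in production
W, L, P, H, F = (v[1:] for v in (W, L, P, H, F))      # drop the start-up period

# Synthetic component costs generated from made-up coefficients plus noise.
c1 = 1.5 * (H + F)**2 + 1.4 * (H - F)**2 + rng.normal(scale=2.0, size=T)
c2 = 140.0 * L + rng.normal(scale=5.0, size=T)
c3 = (500.0 + 0.15 * (P - 2.76 * L)**2 + 0.9 * (W - L)**2 + 0.2 * dP**2
      + rng.normal(scale=5.0, size=T))
C = c1 + c2 + c3 + rng.normal(scale=5.0, size=T)      # total operating cost

# First stage: regress each component cost on its own right-hand-side terms.
X1 = np.column_stack([(H + F)**2, (H - F)**2])
X2 = L.reshape(-1, 1)
X3 = np.column_stack([np.ones(T), (P - 2.76 * L)**2, P, P * L, L,
                      (W - L)**2, dP**2])
_, c1_hat = ols_fit(X1, c1)
_, c2_hat = ols_fit(X2, c2)
_, c3_hat = ols_fit(X3, c3)

# Second stage: regress total cost on the fitted component costs.
Z = np.column_stack([c1_hat, c2_hat, c3_hat])
d, C_hat = ols_fit(Z, C)
r_squared = 1.0 - np.sum((C - C_hat)**2) / np.sum((C - C.mean())**2)
```

In this synthetic setting total cost is by construction the sum of the components, so the second-stage weights `d` come out close to one; with real accounting data they need not.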

1/  For  an  excellent  discussion  of  two-stage  least  squares  and  other 
simultaneous equation techniques in econometrics see Theil [8].

An  interesting  discussion  of  statistical  cost  analysis  within  the 
framework  of  economic  theory  which  reports  on  empirical  studies 
is  available  in  Johnston  [3]. 


EXHIBIT 2: PERIOD COMPARISON OF ACTUAL AND ESTIMATED OPERATING COSTS, HLA MODEL
(vertical axis: operating cost)


Within  the  context  of  our  original  problem  some  additional 
observations  are  worth  noting.  First,  our  earlier  concern  with 
identification  of  cost  coefficients  is  not  an  issue  in  the  above  model 


since  exact  correspondence  exists  between  the  regression  parameters 
and  model  coefficients.  Second,  the  quadratic  form  which  results 
from the coefficient estimates given in equation [25] is a convex
function, and hence the quadratic programming solution presented
earlier  can  be  employed  directly.  Finally,  tests  for  the  reasonableness 
of  the  estimated  parameters  in  the  component  cost  relationships 
in  general  are  satisfied.  For  example,  basic  economics  suggests  that 
the  marginal  cost  of  overtime  should  be  positive  for  increasing 
production  and  constant  work  force  levels  and  conversely  negative 
for  increasing  work  force  and  constant  production.  Isolating  the 


estimated  overtime,  cost  relationship  at  the  first  stage  gives 
$$
\hat{c}_{3t} = 18437 + 0.16\,(P_t - 2.76\,L_t)^2 - 75.64\,P_t + 1.097\,P_t L_t - 205.16\,L_t .
$$


This function is convex for all positive values of $P_t$ and $L_t$, with $(\partial c_{3t}/\partial P_t) > 0$ for $L_t$ constant and $(\partial c_{3t}/\partial L_t) < 0$ for $P_t$ constant.
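The convexity and marginal-cost checks just described can be reproduced numerically. The sketch below is illustrative only. It reads the first-stage estimate as c3t = 18437 + 0.16 (Pt - 2.76 Lt)^2 - 75.64 Pt + 1.097 Pt Lt - 205.16 Lt, where the factor Pt on the 75.64 term is an assumption made by analogy with equation [25]. The operating point P = 300, L = 50 is hypothetical and is not taken from the HLA data, and the function names are our own.

```python
def overtime_cost(P, L):
    """First-stage overtime cost estimate, as read from the text."""
    return (18437 + 0.16 * (P - 2.76 * L) ** 2
            - 75.64 * P + 1.097 * P * L - 205.16 * L)


def gradient(P, L):
    """Analytic partial derivatives (d/dP, d/dL) of overtime_cost."""
    dP = 2 * 0.16 * (P - 2.76 * L) - 75.64 + 1.097 * L
    dL = -2 * 0.16 * 2.76 * (P - 2.76 * L) + 1.097 * P - 205.16
    return dP, dL


def hessian():
    """Second derivatives (PP, PL, LL); constant because the cost is quadratic."""
    h_pp = 2 * 0.16
    h_pl = -2 * 0.16 * 2.76 + 1.097
    h_ll = 2 * 0.16 * 2.76 ** 2
    return h_pp, h_pl, h_ll


def is_convex():
    """A 2x2 symmetric Hessian is positive definite iff both leading minors are > 0."""
    h_pp, h_pl, h_ll = hessian()
    return h_pp > 0 and h_pp * h_ll - h_pl ** 2 > 0


if __name__ == "__main__":
    P, L = 300.0, 50.0  # hypothetical operating point, for illustration only
    print("convex:", is_convex())
    print("marginal costs (dP, dL):", gradient(P, L))
```

Because the function is quadratic, its Hessian is constant, so one check of the leading minors settles convexity. The signs of the marginal costs, by contrast, depend on the point at which they are evaluated, so in practice they would be checked over the range of the observed data.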

It is important to include such tests for the reasonableness of estimates obtained by any mechanical procedure, such as least squares, since typically there is no a priori guarantee that the procedure will not yield nonsense results when the estimated model is interpreted literally.

6. Conclusion: Sensitivity Analysis and Cost Estimation

In conclusion, we turn our attention to the general question of errors in the specification and estimation of the objective cost function and their consequences. The considerations outlined below are expanded more fully in Kriebel [4] and van de Panne and Bosje [10].

Recall from the discussion of quadratic programming that the optimal decisions $a^*$ which minimize the objective cost function $c(a, x)$ are given as

$$
a^* = Kx = -Q^{-1}\lambda ,
$$

where $c(a, x) = \lambda_0(x) + 2a'\lambda(x) + a'Qa$ and, to simplify notation, $\lambda = \lambda(x)$. The minimum cost associated with the implementation of $a^*$ is proportional therefore to

$$
c(a^*) = -\lambda' Q^{-1} \lambda . \qquad [27]
$$
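As a check on [27], the minimizer can be recovered by completing the square. The derivation below writes $\lambda$ for the vector of linear terms and $Q$ for the matrix of the quadratic form, and it assumes that $Q$ is symmetric and positive definite (the convex case); that assumption is ours and is stated here only to make the algebra explicit.

```latex
\begin{aligned}
c(a, x) &= \lambda_0 + 2a'\lambda + a'Qa \\
        &= \lambda_0 + (a + Q^{-1}\lambda)'\,Q\,(a + Q^{-1}\lambda) - \lambda' Q^{-1}\lambda .
\end{aligned}
```

The middle term is non-negative and vanishes only at $a = -Q^{-1}\lambda$, which is therefore the minimizer $a^*$. The minimum is $\lambda_0 - \lambda' Q^{-1}\lambda$, so [27] is the part of the minimum cost that remains once the term $\lambda_0$, which does not depend on $a$, is set aside. The same identity shows that any other decision $a$ costs more than $a^*$ by $(a - a^*)'Q(a - a^*)$.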

Assuming the coefficients are equal to their estimated values, the decision-maker acts in accordance with $a^*$. Since this assumption is not valid generally, the decision-maker commits a decision error. We can consider this decision error, resulting from errors in the specification or estimation of the cost coefficients, as a perturbation, say $\delta(a^*)$, about the optimal actions $a^*$. That is, the actual non-optimal decisions, $a$, based on coefficient errors can be expressed as

$$
a = a^* + \delta(a^*) . \qquad [28]
$$

This decision function can be evaluated implicitly as the Taylor series expansion

$$
a = a^* + d(a^*) + \tfrac{1}{2}\,d^2(a^*) + \cdots + \tfrac{1}{j!}\,d^j(a^*) + \cdots , \qquad [29]
$$
provided this series converges. Letting $R$ represent the remainder terms for higher order differentials in the series, a second order approximation 1/ of $\delta(a^*)$ in terms of the original model is

$$
\delta(a^*) = -Q^{-1}(d\lambda - dQ\,Q^{-1}\lambda) + Q^{-1}\,dQ\,Q^{-1}(d\lambda - dQ\,Q^{-1}\lambda) - \tfrac{1}{2}\,Q^{-1}(d^2\lambda - d^2Q\,Q^{-1}\lambda) + R . \qquad [30]
$$

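The increase in operating cost which results from employing $a$ is thus

$$
\Delta c(a) = \lambda' Q^{-1}\lambda + 2b'\lambda + b'Qb , \qquad [31]
$$

where the second order approximation of $\delta(a^*)$ gives

$$
b = -Q^{-1}\left[d\lambda + \lambda + \tfrac{1}{2}(d^2\lambda - d^2Q\,Q^{-1}\lambda) - dQ\,Q^{-1}(\lambda + d\lambda - dQ\,Q^{-1}\lambda)\right] .
$$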
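A small numerical sketch may help fix ideas about [30] and [31]. It writes Q for the matrix of the quadratic form and lam for the vector of linear terms, so that the optimal decision is a* = -Q^{-1} lam. All numbers are hypothetical, the two-decision setting and the helper names are our own, and the second differentials are taken to be zero (coefficients entering linearly), which is an assumption made only to keep the example short.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def optimal_decision(Q, lam):
    """a* = -Q^{-1} lam."""
    return [-x for x in matvec(inv2(Q), lam)]


def second_order_delta(Q, lam, dQ, dlam):
    """Equation [30] with zero second differentials and the remainder R dropped."""
    Qinv = inv2(Q)
    # e = d(lam) - dQ Q^{-1} lam
    e = [dl - y for dl, y in zip(dlam, matvec(dQ, matvec(Qinv, lam)))]
    first = [-x for x in matvec(Qinv, e)]                # -Q^{-1} e
    second = matvec(Qinv, matvec(dQ, matvec(Qinv, e)))   # Q^{-1} dQ Q^{-1} e
    return [f + s for f, s in zip(first, second)]


def cost_increase(Q, lam, b):
    """Equation [31]: lam' Q^{-1} lam + 2 b' lam + b' Q b."""
    return (dot(lam, matvec(inv2(Q), lam))
            + 2 * dot(b, lam) + dot(b, matvec(Q, b)))


# Hypothetical "true" coefficients and estimation errors (illustration only).
Q = [[2.0, 0.5], [0.5, 1.0]]
lam = [-4.0, -3.0]
dQ = [[0.1, 0.0], [0.0, -0.05]]
dlam = [0.2, -0.1]

a_star = optimal_decision(Q, lam)
Q_hat = [[q + e for q, e in zip(row_q, row_e)] for row_q, row_e in zip(Q, dQ)]
lam_hat = [l + e for l, e in zip(lam, dlam)]
a_actual = optimal_decision(Q_hat, lam_hat)   # decision based on erroneous coefficients

delta_exact = [x - y for x, y in zip(a_actual, a_star)]
delta_approx = second_order_delta(Q, lam, dQ, dlam)
penalty = cost_increase(Q, lam, a_actual)

print("exact decision error:       ", delta_exact)
print("second-order approximation: ", delta_approx)
print("increase in operating cost: ", penalty)
```

In this sketch the cost increase is evaluated with the true coefficients at the decision actually taken, so it is zero at a* and positive elsewhere; the approximation of the decision error differs from the exact error only by third and higher order terms in the coefficient errors.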

1/ Reference to the cost specifications considered in the HLA model indicates that the equations are of degree 3 in cost coefficients so that, in fact, only fourth and higher differentials of $Q$ and $\lambda$ vanish completely.

The practical consequence of this analysis is that it provides the empiricist with guidelines to consider when obtaining estimates of the individual cost elements. For example, in the HLA model if we consider the cost specification for $c(a, x)$ as given by equation [13], then for $\lambda = Dx$, the primary coefficients in $D$ are $\{c_{?}, c_{31}, c_{32}, c_{33}, c_{51}, c_{611}, c_{621}, c_{631}\}$ and those in $Q$ are $\{c_{21}, c_{31}, c_{32}, c_{51}, c_{611}\}$. Clearly, any empirical analysis should focus attention on information pertaining to this second set of coefficients and the estimation of the corresponding cost elements. On the other hand, little or no attention should be devoted to obtaining estimates of the coefficients not included in either subset, such as $c_{?}$ in [13], since they will have no bearing on the analytical results. Furthermore, the general analysis of [30] can be greatly simplified if we bypass the simultaneous occurrence of coefficient errors, and consider the consequences of errors in each coefficient individually. For example, again referring to the specification in [13] and restricting consideration to only those coefficients which appear linearly in $\lambda$ or $Q$, such as $c_{?}$ and $c_{?}$ respectively, the evaluation of [30] simplifies to the first order differentials in $\lambda$ and $Q$, and the Taylor series expansion is now exact for these coefficients upon substituting their differences, $\Delta$, for differentials.
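To illustrate the single-coefficient simplification in the simplest case, consider an error $\Delta\lambda$ in a coefficient that enters only the linear terms $\lambda$, and enters them linearly, so that $dQ = 0$ and all second and higher differentials vanish. Assuming in addition that $Q$ is symmetric (our assumption, made to simplify the last step), [30] and [31] reduce to:

```latex
\begin{aligned}
\delta(a^*) &= -Q^{-1}\,\Delta\lambda , \\
\Delta c(a) &= \delta(a^*)'\,Q\,\delta(a^*) = \Delta\lambda'\,Q^{-1}\,\Delta\lambda .
\end{aligned}
```

Under these assumptions the cost penalty grows with the square of the coefficient error, so halving the error in such a coefficient cuts the resulting increase in operating cost to one quarter.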

This  report  has  reviewed  a  number  of  considerations  in  statistical 
cost  estimation  as  these  problems  relate  to  empirical  studies  in 
operations  research.  To  aid  the  discussion,  the  empirical  issues  were 
illustrated  within  the  specific  context  of  quadratic  programming  and 
a  case  history  was  presented.  In  this  regard,  the  quadratic  programming 
model  was  selected  because  its  mathematical  structure  and  solution  can 
be  stated  readily,  the  estimation  of  its  parameters  is  a  nontrivial  problem, 
and  research  on  applications  (such  as  the  HMMS  analysis)  is  available 
and  documented.  Although  many  of  the  empirical  questions  have  only  been 
outlined,  the  discussion  has  helped  to  point  out  several  conclusions. 

First, the implementation of management science models clearly requires proportionate attention to empirical as well as formal problems of analysis, even when, before the fact, these problem areas may appear to be relatively decoupled. The empirical and formal analyses interact throughout the course of an investigation and serve to reinforce recommendations. For example, in the case analysis the initial model specification of equation [13] expressed decisions in terms of production and work force levels $P_t$ and $W_t$ for $t = 1, 2, \ldots, T$; however, the empirical results leading to the final specification in equation [25] necessitated reformulating the model in terms of the decisions $P_t$, $H_t$, and $F_t$ and adding the definitional constraint $W_t = W_{t-1} + H_t - F_t$ for $t = 1, 2, \ldots, T$.
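The definitional constraint is a simple recursion, and a minimal sketch of it follows. The sketch reads $H_t$ and $F_t$ as additions to and reductions of the work force in period t, an interpretation suggested by their signs in the constraint; the function name and the sample figures are hypothetical.

```python
from itertools import accumulate


def work_force_path(w0, hires, fires):
    """Return [W_1, ..., W_T] from W_t = W_{t-1} + H_t - F_t, starting at W_0 = w0."""
    if len(hires) != len(fires):
        raise ValueError("hires and fires must cover the same periods")
    net_changes = [h - f for h, f in zip(hires, fires)]
    # accumulate(..., initial=w0) yields w0 first; drop it to keep W_1, ..., W_T.
    return list(accumulate(net_changes, initial=w0))[1:]


if __name__ == "__main__":
    # Hypothetical three-period example.
    print(work_force_path(100, [5, 0, 2], [0, 3, 2]))
```

Stating the work force this way lets $H_t$ and $F_t$ serve as the decision variables while $W_t$ is derived from them, which is the role the constraint plays in the reformulated model.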

Second, extension of the formal analysis at the outset can substantially assist in the conduct of the empirical investigation. The preceding discussion on sensitivity analysis serves as a good example of this point. That is, such an analysis beforehand can help to identify the priorities that should be considered in planning the effort to obtain estimates of model parameters and relationships.

Third, within the empirical study, an analysis of sampling errors (such as the covariance matrix for the random disturbances in a regression) provides a natural basis for refining the procedures by which model estimates are obtained, e.g., the rationale leading to the two-stage least squares analysis.

Finally, qualitative information can and should be included within the empirical analysis in addition to available observational data. In this regard, recall the inclusion of an independent estimate for the labor productivity parameter in the final overtime cost specification and the tests on the reasonableness of the derived estimates at the conclusion of the least squares analysis.

As  more  and  more  decision  making  procedures  are  programmed  for 
electronic  computers,  and  these  programs  are  extended  within  the  firm, 
the  empirical  problems  of  data  analysis  and  estimation  will  become  the 
increasing  concern  of  the  management  scientist. 